SYNTHETIC LEADER PEPTIDE SEQUENCES



FIELD OF INVENTION 

The present invention relates to novel synthetic leader peptide sequences for secreting
polypeptides in yeast.

BACKGROUND OF THE INVENTION 

Yeast organisms produce a number of proteins which are synthesized intracellularly,
but which have a function outside the cell. Such extracellular proteins are referred to as
secreted proteins. These secreted proteins are expressed initially inside the cell in a
precursor or a pre-protein form containing a pre-peptide sequence ensuring effective
direction of the expressed product (into the secretory pathway of the cell) across the
membrane of the endoplasmic reticulum (ER). The pre-sequence, normally named a
signal peptide, is generally cleaved off from the desired product during translocation.
Once entered in the secretory pathway, the protein is transported to the Golgi
apparatus. From the Golgi the protein can follow different routes that lead to
compartments such as the cell vacuole or the cell membrane, or it can be routed out of
the cell to be secreted to the external medium (Pfeffer, S.R. and Rothman, J.E.,
Ann. Rev. Biochem. 56 (1987) 829-852).

Several approaches have been suggested for the expression and secretion in yeast of
proteins heterologous to yeast. European published patent application No. 88 632
describes a process by which proteins heterologous to yeast are expressed,
processed and secreted by transforming a yeast organism with an expression vehicle
harbouring DNA encoding the desired protein and a signal peptide, preparing a culture
of the transformed organism, growing the culture and recovering the protein from the
culture medium. The signal peptide may be the signal peptide of the desired protein
itself, a heterologous signal peptide, or a hybrid of native and heterologous signal
peptides.

A problem encountered with the use of signal peptides heterologous to yeast might be
that the heterologous signal peptide does not ensure efficient translocation and/or
cleavage of the precursor polypeptide after the signal peptide.

The Saccharomyces cerevisiae MFα1 (α-factor) is synthesized as a pre-pro form of
165 amino acids comprising a signal- or pre-peptide of 19 amino acids followed by a
"leader" or pro-peptide of 64 amino acids, encompassing three N-linked glycosylation
sites followed by (LysArg((Asp/Glu)Ala)2-3α-factor)4 (Kurjan, J. and Herskowitz, I.,
Cell 30 (1982) 933-943). The signal-leader part of the pre-pro MFα1 has been widely
employed to obtain synthesis and secretion of heterologous proteins in S. cerevisiae.

Use of signal/leader peptides homologous to yeast is known from i.a. US patent
specification No. 4,546,082, European published patent applications Nos. 116 201, 
123 294, 123 544, 163 529 and 123 289 and DK patent application No. 3614/83. 

In EP 123 289 utilization of the S. cerevisiae a-factor precursor is described whereas
WO 84/01153 indicates utilization of the S. cerevisiae invertase signal peptide and DK
3614/83 utilization of the S. cerevisiae PHO5 signal peptide for secretion of foreign
proteins.

US patent specification No. 4,546,082, EP 116 201, 123 294, 123 544 and 163 529
describe processes by which the α-factor signal-leader from S. cerevisiae (MFα1 or
MFα2) is utilized in the secretion process of expressed heterologous proteins in yeast.
By fusing a DNA sequence encoding the S. cerevisiae MFα1 signal/leader sequence
at the 5' end of the gene for the desired protein, secretion and processing of the desired
protein was demonstrated.



EP 206 783 discloses a system for the secretion of polypeptides from S. cerevisiae
using an α-factor leader sequence which has been truncated to eliminate the four
α-factor units present on the native leader sequence so as to leave the leader peptide
itself fused to a heterologous polypeptide via the α-factor processing site
LysArgGluAlaGluAla. This construction is indicated to lead to an efficient processing of
smaller peptides (less than 50 amino acids). For the secretion and processing of larger
polypeptides, the native α-factor leader sequence has been truncated to leave one or
two of the α-factor units between the leader peptide and the polypeptide.



A number of secreted proteins are routed so as to be exposed to a proteolytic
processing system which can cleave the peptide bond at the carboxy end of two
consecutive basic amino acids. This enzymatic activity is in S. cerevisiae encoded by
the KEX 2 gene (Julius, D.A. et al., Cell 37 (1984b) 1075). Processing of the product
by the KEX 2 protease is needed for the secretion of active S. cerevisiae mating factor
α1 (MFα1 or α-factor) whereas KEX 2 is not involved in the secretion of active
S. cerevisiae mating factor a.

Secretion and correct processing of a polypeptide intended to be secreted is obtained
in some cases when culturing a yeast organism which is transformed with a vector
constructed as indicated in the references given above. In many cases, however, the
level of secretion is very low or there is no secretion, or the proteolytic processing may
be incorrect or incomplete resulting in secretion of a considerable amount of leader
bound product polypeptide. Prosequences, and especially N-terminally located
prosequences, or leader sequences expressed in eucaryotic cells, such as yeast cells,
are extensively glycosylated, cf. Fiedler and Simons, Cell, 81, p 309-312; and Moir,
D.T., Yeast mutants with increased secretion efficiency, in Yeast Genetic Engineering,
Barr, P. J., Brake, A. J., and Valenzuela, P. eds., wherein a general review of
glycosylation and secretion of proteins is presented. It is generally recognised that
glycosylation, which may be either N-linked, O-linked, or both, is important for efficient
transport through the secretory pathway, cf. Caplan et al., Journal of Bacteriology, Vol.
173, No. 2, p. 627-635; and Jars et al., The Journal of Biological Chemistry, Vol. 270,
No. 42, p 24810-24817. Moreover, due to the extensive glycosylation the purification of
secreted propeptides is difficult and differs considerably from the processing steps that
are typically employed for the purification of the mature secreted polypeptide.
Clements et al., Gene, 106 (1991) 267-272, have shown that using a eucaryotic
consensus signal sequence and two 19-aa pro-sequences comprising fractions of the
α-Factor leader and identical except for the presence or absence of a potential Asn
linked (N-linked) glycosylation site for secretion of hEGF from yeast had no effect on
secretion, and the level of secretion was comparable to the level obtained when using
the α-Factor prepro-sequence (about 3 µg/ml).

Expression of heterologous proteins as fusion proteins is a well known concept and
has been utilized in various contexts in different organisms. Secretory expression of a
heterologous protein in yeast is often performed as a fusion protein with a secretion
prepro-leader to confer secretion competence. Prepro-leaders tend to be
hyperglycosylated or extensively O-linked glycosylated in the S. cerevisiae secretory
pathway. Purification of hyperglycosylated fusion protein is laborious due to its
heterogeneous nature. Efficient prepro-leaders lacking hyperglycosylation, with no or
limited O-linked glycosylation and replacement of the dibasic Kex2 endoprotease site
with a more convenient enzymatic processing site, provide an alternative to
conventional yeast expression by purification of the fusion protein and subsequently in
vitro maturation with a suitable enzyme as exemplified herein for the insulin precursor.
In vitro maturation of a purified fusion protein is more flexible since dependency on the
Kex2 endoprotease is eliminated and any proteolytic enzyme can be used for
maturation provided that the heterologous protein does not have any internal
processing sites. Purification of the fusion protein from the culture supernatant followed
by in vitro maturation will avoid N-terminal processing of the heterologous protein by
dipeptidyl aminopeptidase. Secretion of a fusion protein rather than the heterologous
protein has the advantage that the propeptide may increase stability and solubility until
purification and maturation. Secretory expression in yeast of heterologous proteins with
internal dibasic sites may lead to Kex2 endoprotease processing and a decrease in
fermentation yield. This can be avoided by utilizing a secretion prepro-leader lacking
N-linked glycosylation to confer secretion competence, introduction of a suitable enzyme
processing site between the prepro-leader and the heterologous protein, expression in
a Kex2 endoprotease negative S. cerevisiae strain followed by purification and in vitro
maturation.
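
By way of illustration only, the check for internal dibasic sites mentioned above may be
sketched in Python as follows. The function name and the example sequences are
hypothetical and are not taken from the present specification; the sketch merely flags
pairs of consecutive basic residues (Lys/Arg) in a one-letter amino acid sequence and
does not model the actual sequence-context dependence of the Kex2 endoprotease.

```python
from __future__ import annotations


def find_dibasic_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, residue pair) for each pair of adjacent basic residues."""
    basic = {"K", "R"}
    seq = sequence.upper()
    sites = []
    for i in range(len(seq) - 1):
        if seq[i] in basic and seq[i + 1] in basic:
            sites.append((i + 1, seq[i:i + 2]))
    return sites


# Hypothetical example sequences (assumptions for illustration only).
print(find_dibasic_sites("AEGKRTFLS"))  # contains one internal Lys-Arg pair
print(find_dibasic_sites("AEGKTFLSR"))  # no adjacent basic residues
```

A heterologous protein for which such a scan returns any hit would, on the reasoning
given above, be a candidate for expression in a Kex2 endoprotease negative strain.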

It is an object of the present invention to provide novel synthetic leader peptides or
pro-sequences which ensure a higher yield and a more efficient recovery and/or
processing of polypeptides, preferably secreted polypeptides, including leader bound
polypeptides, and polypeptides being fused N-terminally to peptide sequences
including leader sequences and/or spacer sequences each of which optionally being
separated from the other constituent sequences by a processing site, expressed in a
eucaryotic host cell organism, preferably a fungal cell, such as a yeast cell or a
filamentous fungus cell.


SUMMARY OF THE INVENTION 

A novel type of synthetic leader peptide has been found which allows secretion in high
yield and/or improved recovery of a polypeptide produced in yeast.

Accordingly, the present invention relates to a DNA construct encoding a polypeptide
and having the structure SP-LP-(PS)-(S)-(PS)-*gene*, wherein SP is a DNA
sequence (presequence) encoding a signal peptide, LP is a DNA sequence
encoding a synthetic leader peptide (propeptide) wherein N-linked glycosylation is
lacking, PS is a DNA sequence encoding a protease processing site which is
optional, S is a DNA sequence encoding a spacer peptide, and *gene* is a DNA
sequence encoding a polypeptide. The structure SP-LP-(PS)-(S)-(PS)-*gene*
comprises the following structures: SP-LP-PS-S-PS-*gene*, SP-LP-PS-*gene*,
SP-LP-PS-S-*gene*, SP-LP-S-*gene*, SP-LP-S-PS-*gene*, and SP-LP-*gene*; in
structures containing more than one PS these may be the same or different.
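
The six structures listed above can be derived mechanically from the general structure
SP-LP-(PS)-(S)-(PS)-*gene* by treating each bracketed element as optional. The
following Python sketch is offered only as an illustration; the function name is
hypothetical, and the exclusion of two directly adjacent PS elements is an assumption
made here solely because that combination does not appear in the list above.

```python
from __future__ import annotations

from itertools import product


def construct_variants() -> list[str]:
    """Enumerate the distinct structures covered by SP-LP-(PS)-(S)-(PS)-*gene*."""
    variants: list[str] = []
    for first_ps, spacer, second_ps in product([True, False], repeat=3):
        # Assumption: PS-PS with no spacer in between is skipped because the
        # specification does not list it among the comprised structures.
        if first_ps and second_ps and not spacer:
            continue
        parts = ["SP", "LP"]
        if first_ps:
            parts.append("PS")
        if spacer:
            parts.append("S")
        if second_ps:
            parts.append("PS")
        parts.append("*gene*")
        name = "-".join(parts)
        if name not in variants:  # a lone PS gives the same string in either slot
            variants.append(name)
    return variants


for structure in construct_variants():
    print(structure)
```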
Preferably, PS is a DNA sequence encoding a yeast protease processing site, such
as an endopeptidase processing site, and LS is preferably a DNA sequence encoding
a synthetic leader peptide or prepro-leader with the general formula I:

Q/SPIDDTESQTTSVNLMADDTESA/RFATYTXLDWN/GL(ISMAy(^^ (I)
wherein 

X is a codable amino acid or preferably a sequence of from 1 to 5 codable amino
acids which may be the same or different, and is preferably selected from the group
consisting of T, L, A, V, D, P, H, N, S, G, and Y is a codable amino acid selected from the
group consisting of Q and N; the C-terminal KR is an optional dibasic processing
site.

More preferably, LS is a DNA sequence encoding a synthetic leader peptide with the
general formula II:

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(^
VNLI(A/D)MAKR (II)

wherein (A/D) can be any codable amino acid, but preferably is alanine (A), serine (S),
or aspartic acid (D); the C-terminal KR is an optional dibasic processing site,
or LS is a DNA sequence encoding a synthetic leader peptide with the general formula
III:

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)^ 
VNLI(A/D)MA (III) 

wherein (A/D) can be any codable amino acid, but preferably is alanine (A), serine (S),
or aspartic acid (D). In formulas I and II above, the C-terminal amino acids KR define a
yeast processing site which is optional.

In the present context, the expression "leader peptide" is understood to indicate a
pro-peptide sequence whose function is to allow the expressed polypeptide product of
*gene*, optionally fused at its N-terminal to a spacer peptide and/or a sequence of one
or more amino acids defining a processing site, to be directed from the endoplasmic
reticulum to the Golgi apparatus and further to a secretory vesicle for secretion into the
medium (i.e. exportation of the expressed polypeptide across the cellular membrane
and cell wall, if present, or at least through the cellular membrane into the periplasmic
space of a cell having a cell wall). The term "synthetic" used in connection with leader
peptides is intended to indicate that the leader peptide is one not found in nature, and,
especially, the leader peptide sequences of the present invention do not include the
α-factor leader sequence or fragments and constructs thereof such as the sequence
QPVISTTVGSAAEGSLDKR, and a leader sequence derived from S. cerevisiae
HSP150 protein having extensive O-linked glycosylation, cf. Simonen, M., Vihinen, H.,
Jamsa, E., Arumae, U., Kalkkinen, N., and Makarow, M. (1996) The hsp150Δ-carrier
confers secretion competence to the rat nerve growth factor receptor ectodomain in
Saccharomyces cerevisiae. Yeast 12, 457-466; Jamsa, E., Holkeri, H., Vihinen, H.,
Wikstrom, M., Simonen, M., Walse, B., Kalkkinen, N., Paakkola, J., and Makarow, M.
(1995) Structural features of a polypeptide carrier promoting secretion of a
beta-lactamase fusion protein in yeast. Yeast 11, 1381-91.

The term "signal peptide" is understood to mean a pre-sequence which is
predominantly hydrophobic in nature and present as an N-terminal sequence of the
precursor form of an extracellular protein, preferably when expressed in yeast. The
function of the signal peptide is to allow the expressed protein to be secreted to enter
the endoplasmic reticulum. The signal peptide is normally cleaved off in the course of
this process. The signal peptide may be heterologous or homologous to the organism
producing the protein.

The expression "polypeptide" is intended to indicate a heterologous polypeptide, i.e. a
polypeptide or protein which is not produced by the host organism, preferably yeast, in
nature as well as a homologous polypeptide, i.e. a polypeptide which is produced by
the host organism, preferably a yeast, in nature and any preform thereof. In a preferred
embodiment, the DNA construct of the present invention encodes a heterologous
polypeptide.

The expression "a codable amino acid" is intended to indicate an amino acid which can
be coded for by a triplet ("codon") of nucleotides.

When, in the amino acid sequences given in the present specification, the one or three
letter codes of two amino acids, separated by a slash, are given in brackets, e.g. (D/E),
this is intended to indicate that the sequence has either the one or the other of these
amino acids in the pertinent position.
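
The bracket-and-slash notation defined above can be expanded into the individual
sequences it covers. The following Python sketch is given for illustration only; the
function name and the short example pattern are hypothetical, and the sketch assumes
one-letter codes with no other symbols in the pattern.

```python
from __future__ import annotations

import re
from itertools import product


def expand_alternatives(pattern: str) -> list[str]:
    """Expand e.g. 'QP(D/E)A' into every concrete sequence it denotes."""
    # Each token is either a bracketed set of alternatives or a single residue.
    tokens = re.findall(r"\(([A-Z](?:/[A-Z])+)\)|([A-Z])", pattern)
    choices = [alts.split("/") if alts else [single] for alts, single in tokens]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical example pattern, not one of the leader sequences of the invention.
print(expand_alternatives("QP(D/E)A"))
```

A pattern with n bracketed two-way positions expands to 2^n sequences, which is why the
general formulas are written in the compact notation.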

The expression "heterologous protein" is intended to indicate a protein or polypeptide
which is not produced by the host organism in nature, preferably the protein or
polypeptide is heterologous in yeast.

The expression "spacer peptide" is intended to indicate an oligopeptide sequence of
one or more amino acid residues, preferably 1 to 12 amino acid residues, more
preferably about 4 to 6 amino acid residues, such as EEAEPK, EEGEPK,
E(EA)3EPK, and EEPK, which may include a processing site, preferably situated
N-terminally and/or C-terminally.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is further illustrated with reference to the appended drawings 
wherein 

Fig. 1 shows the expression plasmid pAK773 containing genes expressing the
N-terminally extended polypeptides of the invention. In Fig. 1 the following
symbols are used: TPI-PROMOTER: Denotes the TPI gene promoter
sequence from S. cerevisiae. 2: Denotes the region encoding a signal/leader
peptide (e.g. from the YAP3 signal peptide and LA19 leader peptide in
conjunction with the EEGEPK N-terminally extended MIS insulin precursor).
TPI-TERMINATOR: Denotes TPI gene terminator sequence of S. cerevisiae.
TPI-POMBE: Denotes TPI gene from S. pombe. Origin: Denotes a sequence
from S. cerevisiae 2 µ plasmid including its origin of DNA replication in S.
cerevisiae. AMP-R: Sequence from pBR322/pUC13 including the ampicillin
resistance gene and an origin of DNA replication in E. coli.

Fig. 2 shows an example of a DNA sequence pAK855 (SEQ ID No. 1) encoding the
YAP3 signal peptide, a leader without potential N-linked glycosylation sites, the
TA57 leader, and EEGEPK-MI3 insulin precursor complex.

Fig. 3 shows an example of a DNA sequence (SEQ ID No. 2) encoding the YAP3
signal peptide, a leader without potential N-linked glycosylation sites, the
leader TA67, and MIS insulin precursor without N-terminal extension complex.

Fig. 4 shows the expression plasmid pAK855 containing genes expressing the
leader sequences of the invention.

Fig. 5 shows in vitro conversion of LA34/IP fusion protein by Achromobacter lyticus
lysyl specific protease as a plot of the conversion of LA34/IP fusion protein by
Sepharose-bound Achromobacter lyticus lysyl specific protease vs. time. A
curve for a first order reaction with (pseudo-)equilibrium is fitted to the data
points.

Fig. 6 shows mass spectrometry of in vitro maturation of purified LA34 prepro-leader
insulin precursor (MIS) fusion protein by Achromobacter lyticus lysyl specific
endoprotease.



DETAILED DISCLOSURE OF THE INVENTION 



Preferred leader sequences of the invention are shown in Table 1 below. 

Table 1

Strain   Leader No.  Leader sequence                                                       SEQ ID No.

                     OPinr^TP^<^TTQ\/MI MAnnTPQDCATOTTI Al n\A/MI ICMAITD                  3
yMi\oOf              npinnTPQOTTQ\/MI MAr^nTPCDCATnTDl Al r^\A/MI IQ^ilAl/'D               4
y/v\oOo  TA**R       KdrlUU I COl«l 1 1 OVInLIViMIJIJ I toKrA 1 IN 1 iMALIJVVlNLIOlviAlvK  5
         1 AwO
         TACT        r\DinrM^Qr>TTC\/Kii fciiAnoTxroACAX/^XMer^r^i rwA/r^i iohaai/d        6
                     Ur lUU 1 coU 1 i oViMLIvlAiJU 1 coAr A 1 U 1 NoooLiJ VVoLioiviAKR
yAivobi              UrlUU 1 toU 1 1 oVNLfviAUlJ itoArAI uTToVGoLDNA^GLIoMAKR              7
         LAd4        QPIDDTtoQTTSVN LMADDTcSRFATQT^                                        8
         TAoo        QPIDUTcSQTTSVNLMADDTcoArATQTNSGG                                      9
         TAlOl       UKIUU 1 b5U 1 1 &>VNLMAUU 1 bSAPATQTNSGGLDWGLISMA                     10
         TAo7        UKIUU I hbU 1 I bVNLMAUD rtSAFATQTTSVGGLDWGLPGAKR                     11
         TA68        UHIUU I fcbU r Fb VNLMADDTESAFATQTPLALDWNLISMAKR                      12
         LA34        QPIDDTESQTTSVNLMADDTESRFATQTTLALDWNLISMA                              13
         TA7fi       nPinnTP^OTTWWI MAr^riTF^PPATOTTI P/^Ak'P                              14
                     V*<r lUU 1 COVxl I 1 O VI>iL.lviML/L/ 1 COrxrMI Mil Ur VPrvxrv
         TA77        QPIDDTESQTTSVNLMADDTESRALDWNLPGAKR                                    15
         TA78        QPIDDTESQTTSVNLMFATQTTLALDWNLPGAKR                                    16
         TA79        QPIDDTESQADDTESRFATQTTLALDWNLPGAWR                                    17
         TA80        QPTTSVNLMADDTESRFATQTTLALDWNLPGAKR                                    18
         TA89        QPIDDTESQTTSVNLMADDTESAFATQTNSGGLDWGNTTLISMAKR                        19
         TA90        QPIDDTESQTTSVNLMADDTCSAFATQTNSGGLDWGUNTTMAKR                          20

In the sequences of Table 1 the C-terminal KR defines a dibasic protease processing
site.
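
As an illustration of the two sequence features discussed here, the following Python
sketch checks a leader sequence for potential N-linked glycosylation sites and for a
C-terminal KR. It assumes the general consensus sequon Asn-X-Ser/Thr (X being any
residue other than Pro) as the criterion for a potential N-linked site; the function
names are hypothetical. The two sequences used are the LA34 (SEQ ID No. 13) and TA80
(SEQ ID No. 18) entries as they read in Table 1.

```python
from __future__ import annotations

import re


def n_glycosylation_sequons(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, tripeptide) for each N-X-S/T sequon with X != P."""
    # A lookahead is used so that overlapping sequons are all reported.
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(r"(?=(N[^P][ST]))", sequence.upper())]


def has_c_terminal_kr(sequence: str) -> bool:
    """True if the sequence ends in the dibasic pair Lys-Arg (KR)."""
    return sequence.upper().endswith("KR")


LA34 = "QPIDDTESQTTSVNLMADDTESRFATQTTLALDWNLISMA"
TA80 = "QPTTSVNLMADDTESRFATQTTLALDWNLPGAKR"

for name, seq in (("LA34", LA34), ("TA80", TA80)):
    print(name, n_glycosylation_sequons(seq), has_c_terminal_kr(seq))
```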


Further preferred leader sequences of the invention are shown in Table 1a and 1b 
below. 

Table 1a

Leader No.  Leader Sequence                                                               SEQ ID No.

TA75        QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)PLALDWNLISMA  21
TA75.50     QPIDDAEAQAAAVNLMADDDEGFAAQAPLALDWNLISMA                                       22
TA75.15     QPIDDAEAQDDDVNLMADDDGRFADQAPLALDWNLISMA                                       23
TA75.4      QPIDDAEAQDAAVNLMADDGRLKIRFAAQAPLALDWNLISMA                                    24
TA75.51     QPIDDAEDQAAAVNLMADDDEDGFAAQAPLALOWNLISMA                                      25
TA75.58     QPIDDAEAQDDDVNLMADDOGRFAAQAPLALDWNUSMA                                        26
TA75.64     QPIDOAEAQDDDVNLMADODGRFAAQAPLALDWNUSMA                                        27
            and any of the above where SMA is replaced by X^MA, wherein X^ may be         28
            any codable amino acid, preferably hydrophilic amino acids

Table 1b

Leader No.  Leader Sequence                         SEQ ID No.

TA91        QPTTSVNLMADDTESAFATQTNSGGLDWGLISMAKR    29
TA92        QPIDDTESQADDTESAFATQTNSGGLDWGLISMAKR    30
TA93        QPIDDTESQTTSVNLMFATQTNSGGLDWGLISMAKR    31
TA94        QPIDDTESQTTSVNLMADDTESAGGLDWGLISMAKR    32
TA95        QPIDDTESQTTSVNLMADDTESAFATQTNSUSMAKR    33
TA96        QPIDDTESQTTSVNLMADDTESAFATQTNSGGLMAKR   34
TA97        QPIDDTESQTTSVNLMADDTESALISMAKR          35
TA98        QPIDDTESQTTSVNLMLISMAKR                 36



The heterologous protein or polypeptide produced by the method of the invention
may be any protein which may advantageously be produced in yeast. Preferred
examples of such proteins are aprotinin, tissue factor pathway inhibitor or other
protease inhibitors, and insulin or insulin precursors, insulin analogues, insulin-like
growth factors, such as IGF I and IGF II, human or bovine growth hormone,
interleukin, tissue plasminogen activator, glucagon, glucagon-like peptide-1 (GLP 1),
glucagon-like peptide-2 (GLP 2), GRPP, Factor VII, Factor VIII, Factor XIII,
platelet-derived growth factor, enzymes, such as lipases, or a functional analogue of any
one of these proteins. More preferred proteins are precursors of insulin and insulin-like
growth factors, and especially the smaller peptides of the proglucagon family, such
as glucagon, GLP 1, GLP 2, and GRPP, including truncated forms, such as
GLP-1(1-45), GLP-1(1-39), GLP-1(1-38), GLP-1(1-37), GLP-1(1-36), GLP-1(1-35),
GLP-1(1-34), GLP-1(7-45), GLP-1(7-39), GLP-1(7-38), GLP-1(7-37), GLP-1(7-36),
GLP-1(7-35), and GLP-1(7-34).

In the present context, the term "functional analogue" is meant to indicate a
polypeptide with a similar function as the native protein (this is intended to be
understood as relating to the nature rather than the level of biological activity of the
native protein). The polypeptide may be structurally similar to the native protein and
may be derived from the native protein by addition of one or more amino acids to
either or both the C- and N-terminal end of the native protein, substitution of one or
more amino acids at one or a number of different sites in the native amino acid
sequence, deletion of one or more amino acids at either or both ends of the native
protein or at one or several sites in the amino acid sequence, or insertion of one or
more amino acids at one or more sites in the native amino acid sequence. Such
modifications are well known for several of the proteins mentioned above.

The precursors of insulin, including proinsulin as well as precursors having a truncated
and/or modified C-peptide or completely lacking a C-peptide, precursors of insulin
analogues, and insulin related peptides, such as insulin-like growth factors, may be of
human origin or from other animals and recombinant or semisynthetic sources. The
cDNA used for expression of the precursors of insulin, precursors of insulin analogues,
or insulin related peptides in the method of the invention include codon optimised
forms for expression in yeast.

By "a precursor of insulin" or "a precursor of an insulin analogue" is to be understood a
single-chain polypeptide which by one or more subsequent chemical and/or
enzymatical processes can be converted to a two-chain insulin or insulin analogue
molecule having the correct establishment of the three disulphide bridges as found in
natural human insulin. Preferred insulin precursors are MI1, B(1-29)-A(1-21); MI3,
B(1-29)-Ala-Ala-Lys-A(1-21) (as described in e.g. EP 163 529); X14,
B(1-27-Asp-Lys)-Ala-Ala-Lys-A(1-21) (as described in e.g. PCT publication No. 95/00550);
B(1-27-Asp-Lys)-A(1-21); B(1-27-Asp-Lys)-Ser-Asp-Asp-Ala-Lys-A(1-21);
B(1-29)-Ala-Ala-Arg-A(1-21) (as described in e.g. PCT Publication No. 95/07931); MI5,
B(1-29)-Ser-Asp-Asp-Ala-Lys-A(1-21); and B(1-29)-Ser-Asp-Asp-Ala-Arg-A(1-21), and more
preferably MI1, B(1-29)-A(1-21), MI3, B(1-29)-Ala-Ala-Lys-A(1-21) and MI5,
B(1-29)-Ser-Asp-Asp-Ala-Lys-A(1-21).

Examples of insulins or insulin analogues which can be produced in this way are
human insulin, preferably des(B30) human insulin, porcine insulin; and insulin
analogues wherein at least one Lys or Arg is present, preferably insulin analogues
wherein PheB1 has been deleted, insulin analogues wherein the A-chain and/or the
B-chain have an N-terminal extension and insulin analogues wherein the A-chain and/or
the B-chain have a C-terminal extension. Other preferred insulin analogues are such
wherein one or more of the amino acid residues, preferably one, two, or three of them,
have been substituted by another codable amino acid residue. Thus, in position A21 a
parent insulin may instead of Asn have an amino acid residue selected from the group
comprising Ala, Gln, Glu, Gly, His, Ile, Leu, Met, Ser, Thr, Trp, Tyr or Val, in particular
an amino acid residue selected from the group comprising Gly, Ala, Ser, and Thr. The
insulin analogues may also be modified by a combination of the changes outlined
above. Likewise, in position B28 a parent insulin may instead of Pro have an amino
acid residue selected from the group comprising Asp and Lys, preferably Asp, and in
position B29 a parent insulin may instead of Lys have the amino acid Pro. The
expression "a codable amino acid residue" as used herein designates an amino acid
residue which can be coded for by the genetic code, i.e. a triplet ("codon") of
nucleotides.

The signal sequence (SP) may encode any signal peptide which ensures an effective
direction of the expressed polypeptide into the secretory pathway of the cell. The signal
peptide may be a naturally occurring signal peptide or functional parts thereof or it may
be a synthetic peptide. Suitable signal peptides have been found to be the α-factor
signal peptide, the signal peptide of mouse salivary amylase, a modified
carboxypeptidase signal peptide, the yeast BAR1 signal peptide or the Humicola
lanuginosa lipase signal peptide or a derivative thereof. The mouse salivary amylase
signal sequence is described by Hagenbüchle, O. et al., Nature 289 (1981) 643-646.
The carboxypeptidase signal sequence is described by Valls, L.A. et al., Cell 48 (1987)
887-897. The BAR1 signal peptide is disclosed in WO 87/02670. The yeast aspartic
protease 3 signal peptide is described in Danish patent application No. 0828/93.

The yeast processing site encoded by the DNA sequence PS may suitably be any
paired combination of Lys and Arg, such as LysArg, ArgLys, ArgArg or LysLys which
permits processing of the polypeptide by the KEX2 protease of Saccharomyces
cerevisiae or the equivalent protease in other yeast species (Julius, D.A. et al., Cell 37
(1984) 1075). If KEX2 processing is not convenient, e.g. if it would lead to cleavage of
the polypeptide product, e.g. due to the presence of two consecutive basic amino acids
internally in the desired product, a processing site for another protease may be
selected comprising an amino acid combination which is not found in the polypeptide
product, e.g. the processing site for FXa, IleGluGlyArg (cf. Sambrook, J., Fritsch, E.F.
and Maniatis, T., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, New York, 1989).
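
The selection logic described in the preceding paragraph may be sketched as follows.
This Python example is illustrative only: the function name, the fallback order, and the
example product sequences are assumptions made here, and only the two sites named in
the text (a LysArg dibasic site and the FXa site IleGluGlyArg) are considered.

```python
from __future__ import annotations

DIBASIC_PAIRS = ("KR", "RK", "RR", "KK")  # paired combinations of Lys and Arg
FXA_SITE = "IEGR"                         # IleGluGlyArg


def choose_processing_site(product_seq: str) -> str | None:
    """Pick a processing site whose motif does not occur inside the product itself."""
    seq = product_seq.upper()
    if not any(pair in seq for pair in DIBASIC_PAIRS):
        return "KR"      # no internal dibasic pair: a KEX2 site is usable
    if FXA_SITE not in seq:
        return FXA_SITE  # fall back to a site for another protease
    return None          # neither candidate is absent from the product


# Hypothetical product sequences (assumptions for illustration only).
print(choose_processing_site("AEGTFTSDV"))
print(choose_processing_site("AEGKRTFTS"))
```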

Two of the preferred DNA constructs encoding leader sequences are incorporated in
SEQ ID Nos. 1 and 2 as shown in Fig. 2 codon 1078-1209, and Fig. 3 codon
1028-1206, or suitable modifications thereof. Examples of suitable modifications of the DNA
sequence are nucleotide substitutions which do not give rise to another amino acid
sequence of the protein, but which may correspond to the codon usage of the
organism, preferably a fungal organism, such as a yeast, into which the DNA construct
is inserted, or nucleotide substitutions which do give rise to a different amino acid
sequence and therefore, possibly, a different protein structure. Other examples of
possible modifications are insertion of one or more codons into the sequence, addition
of one or more codons at either end of the sequence and deletion of one or more
codons at either end of or within the sequence.
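
The distinction drawn above between nucleotide substitutions that leave the amino acid
sequence unchanged and those that alter it can be illustrated with the standard genetic
code. The following Python sketch is an illustration only; the function names and the
short DNA fragments are hypothetical and are not taken from SEQ ID Nos. 1 or 2.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (length a multiple of 3) to one-letter code."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_modification(original: str, modified: str) -> bool:
    """True if both DNA sequences encode the same amino acid sequence."""
    return translate(original) == translate(modified)


# Hypothetical fragments encoding Gln-Pro-Ile with different codons.
print(translate("CAACCAATT"), is_silent_modification("CAACCAATT", "CAGCCTATC"))
```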

One aspect of the invention is a recombinant expression vector carrying any one of the
expression cassettes

5'-P-SP-LP-(PS)-(S)-(PS)-*gene*-(T)i-3'
5'-P-SP-LP-PS-*gene*-(T)i-3'
5'-P-SP-LP-S-PS-*gene*-(T)i-3'
5'-P-SP-LP-PS-S-*gene*-(T)i-3'
5'-P-SP-LP-S-*gene*-(T)i-3'
5'-P-SP-LP-*gene*-(T)i-3'
5'-P-SP-LP-PS-S-PS-*gene*-(T)i-3'

wherein P is a promoter sequence, SP, LP, PS, S, and *gene* are as defined above,
T is a suitable terminator, e.g. the TPI terminator (cf. Alber, T. and Kawasaki, G., J.
Mol. Appl. Genet. 1 (1982) 419-434), and i is 1 or 0. The vector may be any vector
which is capable of replicating in yeast organisms. The promoter may be any DNA
sequence which shows transcriptional activity in yeast and may be derived from genes
encoding proteins either homologous or heterologous to yeast. The promoter is
preferably derived from a gene encoding a protein homologous to yeast. Examples of
suitable promoters for use in yeast host cells are the Saccharomyces cerevisiae
MFα1, TPI, ADH, PGK promoters, or the yeast plasmid 2µ replication genes REP 1-3
and origin of replication. The vector may also comprise a selectable marker, e.g. the
Schizosaccharomyces pombe TPI gene as described by Russell, P.R., Gene 40
(1985) 125-130.

The expression vector of the invention may be any expression vector that is 
conveniently subjected to recombinant DNA procedures, and the choice of vector will 
often depend on the host cell into which the vector is to be introduced. Thus, the 
vector may be an autonomously replicating vector, i.e. a vector which exists as an
extrachromosomal entity, the replication of which is independent of chromosomal 
replication, e.g. a plasmid. Alternatively, the vector may be one which, when 
introduced into a host cell, is integrated into the host cell genome and replicated 
together with the chromosome(s) into which it has been integrated. 

The methods used to ligate the sequence 5'-P-SP-LS-PS-*gene*-(T)i-3' and to insert it
into suitable yeast vectors containing the information necessary for yeast replication,
are well known to persons skilled in the art (cf., for instance, Sambrook, J., Fritsch, E.F.
and Maniatis, T., op.cit.). It will be understood that the vector may be constructed either
by first preparing a DNA construct containing the entire sequence 5'-P-SP-LS-PS-
*gene*-(T)i-3' and subsequently inserting this fragment into a suitable expression
vector, or by sequentially inserting DNA fragments into a suitable vector containing
genetic information for the individual elements (such as the promoter sequence, the
signal peptide, the leader sequence GlnProIle(Asp/Glu)(Asp/Glu)X1(Glu/Asp)X2
AsnZ(Thr/Ser)X3, the processing site, the polypeptide, and, if present, the terminator
sequence) followed by ligation.

In a further aspect, the present invention relates to a process for producing a
polypeptide (or protein) in yeast, the process comprising culturing a yeast cell, which is
capable of expressing said polypeptide and which is transformed with a yeast
expression vector as described above including a leader peptide sequence of the
invention, in a suitable medium to obtain expression and secretion of the said
polypeptide, after which the polypeptide is recovered from the medium. The term
"culturing" includes fermenting a yeast under laboratory and industrial conditions to
produce the polypeptide of interest.

Yeasts are fungi of the class Ascomycetes, subclass Hemiascomycetidae. The yeast 
organism used in the method of the invention may be any suitable yeast organism 
which, on cultivation, produces large amounts of the desired polypeptide. Examples of 
suitable yeast organisms may be strains of the yeast species Saccharomyces 
cerevisiae, Saccharomyces kluyveri, Saccharomyces uvarum, Schizosaccharomyces 
pombe, Kluyveromyces lactis, Hansenula polymorpha, Pichia pastoris, Pichia 
methanolica, Pichia kluyveri, Yarrowia lipolytica, Candida sp., Candida utilis, Candida 
cacaoi, Geotrichum sp., and Geotrichum fermentans. It is considered obvious for the 
person skilled in the art to select any other fungal cell, such as cells of the genus 
Aspergillus, as the host organism. 

The transformation of the yeast cells may for instance be effected by protoplast 
formation followed by transformation in a manner known per se. The medium used to 
cultivate the cells may be any conventional medium suitable for growing yeast 
organisms. The secreted polypeptide, a significant proportion of which will be present 
in the medium in correctly processed form, may be recovered from the medium by 
conventional procedures including separating the yeast cells from the medium by 
centrifugation or filtration, precipitating the proteinaceous components of the 
supernatant or filtrate by means of a salt, e.g. ammonium sulphate, followed by 
purification by a variety of chromatographic procedures, e.g. ion exchange 
chromatography, affinity chromatography or the like. 

The invention is further described in the following examples which are not to be 
construed as limiting the scope of the invention as claimed. 

EXAMPLES 


Construction of the yeast strain expressing the insulin precursor mediated by leaders 
lacking N-linked glycosylation. 

Synthetic genes coding for the leaders lacking amino acid sequences potentially 
subject to N-linked glycosylation, in context with the insulin precursor with or 
without an N-terminal extension, were constructed using the Polymerase Chain 
Reaction (PCR). Oligonucleotides for PCR were synthesised using an automatic 
DNA synthesizer (Applied Biosystems model 380A) using phosphoramidite chemistry 
and commercially available reagents (Beaucage, S.L. and Caruthers, M.H., 
Tetrahedron Letters 22 (1981) 1859-1869). The PCR was performed using the Pwo 
DNA or EHF polymerase (Boehringer Mannheim GmbH, Sandhoefer Strasse 116, 
Mannheim, Germany) according to the manufacturer's instructions, and the PCR mix 
was overlaid with 100 µl mineral oil (Sigma Chemical Co., St. Louis, MO, USA). 

PCR 

5 µl oligonucleotide (50 pmol) 
5 µl oligonucleotide (50 pmol) 
10 µl 10X PCR buffer 
8 µl dNTP mix 
0.5 µl Pwo or EHF enzyme 
0.5 µl pAK680 plasmid as template (0.2 µg DNA) 
71 µl distilled water 
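As a quick consistency check on the reaction mix listed above, the following minimal Python sketch sums the component volumes and derives the final concentrations that follow from them. It assumes that the second oligonucleotide is also added in 5 µl (the volume unit is illegible in the source) and that the two 50 pmol oligonucleotides are distinct primers; the variable names are illustrative only.

```python
# Component volumes (µl) of the PCR mix as listed in the text.
# Assumption: the second oligonucleotide is also 5 µl.
pcr_mix_ul = [
    ("oligonucleotide 1 (50 pmol)", 5.0),
    ("oligonucleotide 2 (50 pmol)", 5.0),
    ("10X PCR buffer", 10.0),
    ("dNTP mix", 8.0),
    ("Pwo or EHF enzyme", 0.5),
    ("pAK680 template (0.2 ug DNA)", 0.5),
    ("distilled water", 71.0),
]

# Total reaction volume in µl.
total_ul = sum(volume for _, volume in pcr_mix_ul)

# Final buffer strength: a 10X stock diluted 10 µl into the total volume.
buffer_fold = 10.0 * 10.0 / total_ul

# Final primer concentration: pmol per µl is numerically equal to µM.
primer_conc_um = 50.0 / total_ul

# Final template concentration in ng/µl (0.2 µg = 200 ng).
template_ng_per_ul = 200.0 / total_ul

print(total_ul, buffer_fold, primer_conc_um, template_ng_per_ul)
```

Under these assumptions the mix totals 100 µl, which brings the 10X buffer to 1X and each primer to 0.5 µM.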

A total of 12 cycles were performed; one cycle was 94 °C for 45 sec, 40 °C for 1 min, 
and 72 °C for 1.5 min. The PCR mixture was then loaded onto a 2.5% agarose gel and 
electrophoresis was performed using standard techniques (Sambrook J, Fritsch EF 
and Maniatis T, Molecular cloning, Cold Spring Harbor Laboratory Press, 1989). The 
resulting DNA fragment was cut out of the agarose gel and isolated by the Gene Clean 
kit (Bio 101 Inc., PO Box 2284, La Jolla, CA 92038, USA) according to the 
manufacturer's instructions. 
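The cycling programme described above can be written down as data, which makes the total hold time easy to compute. This is only a sketch: the step labels (denaturation, annealing, extension) are the conventional reading of the three temperatures and are not stated in the text, and ramp times between temperatures are ignored.

```python
# One PCR cycle as (label, temperature in deg C, hold time in seconds).
# The labels are an assumption based on standard PCR practice.
cycle_steps = [
    ("denaturation", 94, 45),
    ("annealing", 40, 60),
    ("extension", 72, 90),
]
n_cycles = 12

# Hold time of a single cycle and of the whole programme.
seconds_per_cycle = sum(hold for _, _, hold in cycle_steps)
total_minutes = n_cycles * seconds_per_cycle / 60

print(seconds_per_cycle, total_minutes)
```

With the stated hold times, one cycle takes 195 s and the 12 cycles take 39 min of hold time in total.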

Certain leader DNA sequences were constructed by overlap PCR reaction as 
described by Horton, R.M., Cai, Z., Ho, S.N. and Pease, L.R.: Gene splicing by overlap 
extension: tailor-made genes using the polymerase chain reaction. Biotechniques 8 
(1990) 528-535. 

The purified PCR DNA fragment was dissolved in distilled water and restriction 
endonuclease buffer and typically cut with the restriction endonucleases BglII and 
NcoI according to standard techniques (Sambrook J, Fritsch EF and Maniatis T, 
Molecular cloning, Cold Spring Harbor Laboratory Press, 1989). The NcoI-XbaI DNA 
fragment of 209 nucleotide basepairs was subjected to agarose electrophoresis and 
purified using the Gene Clean Kit as described. 

The expression plasmid pAK721 or a similar plasmid of the cPOT type (see Fig. 1) was 
typically cut with the restriction endonucleases BglII and XbaI and the vector fragment 
of 10849 nucleotide basepairs isolated using the Gene Clean Kit as described. 

Typically, the plasmid pAK773 encoding the N-terminally extended EEGEPK-insulin 
precursor was cut with the restriction endonucleases NcoI and XbaI and the DNA 
fragment of 209 nucleotide basepairs isolated using the Gene Clean Kit as described. 
The three DNA fragments were ligated together using T4 DNA ligase and standard 
conditions (Sambrook J, Fritsch EF and Maniatis T, Molecular cloning, Cold Spring 
Harbor Laboratory Press, 1989). The ligation mix was then transformed into a 
competent E. coli strain (R-, M+) followed by selection for ampicillin resistance. 

Plasmid from the resulting E. coli was isolated using standard techniques (Sambrook J, 
Fritsch EF and Maniatis T, Molecular cloning, Cold Spring Harbor Laboratory Press, 
1989), and checked for insert with appropriate restriction endonucleases, i.e. BglII, 
EcoRI, NcoI and XbaI. The selected plasmid was shown by DNA sequence analysis 
(Sequenase, U.S. Biochemical Corp., USA) to encode the DNA sequence for the 
leader-MI3 insulin precursor DNA and the DNA encoding the leader to be inserted 
before the DNA encoding the MI3 insulin precursor DNA. 

An example of a DNA sequence, pAK855 (SEQ ID No. 1), encoding the YAP3 signal 
peptide - synthetic leader without potential N-linked glycosylation sites (the TA57 
leader) - EEGEPK-MI3 insulin precursor complex is shown in Fig. 2. 
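The absence of potential N-linked glycosylation sites in a leader can be checked mechanically by scanning for the consensus pattern N-X-T/S. The sketch below is one way to do this in Python, not a method described in the source. The TA57 leader string is transcribed from the amino acid translation printed with the sequence listing for this construct (from QPIDD... up to the KR processing site), the exclusion of proline at position X follows the wording used in the later claims, and the short test peptides are hypothetical.

```python
def find_sequons(peptide, exclude_proline=True):
    """Return (1-based position, tripeptide) for every N-X-T/S match.

    When exclude_proline is True, N-P-T/S is not reported, since proline
    at position X is generally taken to block N-linked glycosylation.
    """
    hits = []
    for i in range(len(peptide) - 2):
        if peptide[i] != "N":
            continue
        if exclude_proline and peptide[i + 1] == "P":
            continue
        if peptide[i + 2] in "ST":
            hits.append((i + 1, peptide[i:i + 3]))
    return hits


# TA57 leader as read from the translation in the sequence listing
# (transcription by the editor; verify against the figure before reuse).
ta57_leader = "QPIDDTESQTTSVNLMADDTESAFATQTNSGGLDVVGLISMAKR"

# Hypothetical peptides used only to show what a positive hit looks like.
with_sequon = "ESNTTS"
with_proline = "ANPTA"

print(find_sequons(ta57_leader))
print(find_sequons(with_sequon))
print(find_sequons(with_proline), find_sequons(with_proline, exclude_proline=False))
```

The two asparagines in the TA57 leader are followed by L-M and S-G respectively, so neither forms an N-X-T/S site.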

An example of a DNA sequence (SEQ ID No. 2) encoding the YAP3 signal peptide - 
synthetic leader without potential N-linked glycosylation sites (the TA69 leader) - MI3 
insulin precursor without N-terminal extension complex is shown in Fig. 3. 

The yeast expression plasmids used are of the C-POT type (see Fig. 1 and 4) and are 
similar to those described in WO EP 171 142, which contain the Schizosaccharomyces 
pombe triose phosphate isomerase gene (POT) for plasmid selection and stabilisation 
in S. cerevisiae. pAK855 also contains the S. cerevisiae triose phosphate isomerase 
promoter and terminator. The promoter and terminator are similar to those described in 
the plasmid pKFN1003 (described in WO 90/100075) as are all sequences in the plasmid 
except the sequence between the EcoRI-XbaI fragment encoding the YAP3 signal 
peptide-leader without N-linked glycosylation-MI3 insulin precursor with or without 
N-terminal extension. 


Purified LA34/IP fusion protein was processed by Sepharose-bound Achromobacter 
lyticus lysyl specific protease (EC 3.4.21.50) to insulin desB30 (Fig. 5, Fig. 6). From the 
RP-HPLC analysis results the conversion yield for the removal of the LA34 leader from 
the IP molecule in each collected sample was calculated and then plotted in a graph 
showing the conversion as a function of the reaction time. A curve for a first-order 
reaction reaching a (pseudo-)equilibrium can be fitted to the data points as shown in 
Fig. 5, Fig. 6. Electrospray mass spectrometry was performed on the proteinaceous 
material isolated from the two main peaks eluted by the RP-HPLC fractionation of the 
final reaction mixture. For the first eluting peak a Mw of 5706 Da was found, 
corresponding to des(B30)-human insulin (calculated Mw: 5706 Da), and for the 
second peak a Mw of 5625 Da was found, corresponding to the di-mannosylated 
LA34-EEAEAEAEPK polypeptide lacking the dipeptide QP (calculated Mw: 5627 Da), 
the QP dipeptide presumably having been removed by the dipeptidyl aminopeptidase 
during secretion. This means that within the reaction time an almost complete cleavage 
of the precursor to an active desB30 insulin molecule has taken place. 
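A first-order reaction approaching a plateau, as mentioned for the conversion curves, is commonly modelled as c(t) = c_eq (1 - exp(-k t)). The sketch below assumes that functional form, which the text does not state explicitly, and shows one simple way to estimate the rate constant k by linearisation once the plateau c_eq is known. The time points, plateau value and rate constant are placeholders chosen only to exercise the code; they are not measurements from Fig. 5 or Fig. 6.

```python
import math


def conversion(t, c_eq, k):
    """Conversion at time t for a first-order approach to the plateau c_eq."""
    return c_eq * (1.0 - math.exp(-k * t))


def estimate_k(times, conversions, c_eq):
    """Estimate k from ln(1 - c/c_eq) = -k*t by least squares through the origin."""
    num = 0.0
    den = 0.0
    for t, c in zip(times, conversions):
        if t <= 0 or c >= c_eq:
            continue  # points at t = 0 or on the plateau carry no rate information
        y = -math.log(1.0 - c / c_eq)
        num += t * y
        den += t * t
    return num / den


# Placeholder values (NOT data from the document): points are generated from
# the model itself to show that the estimator recovers the rate constant.
times = [5.0, 10.0, 20.0, 40.0]
points = [conversion(t, 0.95, 0.1) for t in times]
k_hat = estimate_k(times, points, 0.95)
print(k_hat)
```

With real RP-HPLC data the plateau would normally be fitted together with k, for example by non-linear least squares, rather than fixed in advance.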
CLAIMS 

1. A DNA construct encoding a polypeptide and having the structure 
SP-LP-(PS)-(S)-(PS)-*gene*, 
wherein SP is a DNA sequence (presequence) encoding a signal peptide, LP is a 
DNA sequence encoding a synthetic leader peptide (propeptide) wherein N-linked 
glycosylation is lacking, PS is a DNA sequence encoding a protease processing 
site which is optional in both positions, S is a DNA sequence encoding a spacer 
peptide which is optional, and *gene* is a DNA sequence encoding a polypeptide. 

2. A DNA construct according to claim 1, and having the structure 
SP-LP-PS-*gene*, 
wherein SP, LP, PS, and *gene* have the meanings defined above. 

3. A DNA construct according to claim 2, which furthermore comprises a sequence 
encoding a spacer peptide located at the 5' end of *gene* and optionally 
comprises a sequence encoding a protease processing site located between the 
3' end of the sequence encoding said spacer peptide and the 5' end of said 
*gene*. 

4. A DNA construct according to any one of the preceding claims which is 
furthermore characterised in that O-linked glycosylation of LP is lacking. 

5. A DNA construct according to any one of claims 1, 2 and 3 which is furthermore 
characterised in LP having O-linked glycosylation. 

6. A DNA construct according to any one of the preceding claims, characterised in 
that LP does not comprise the consensus N-linked glycosylation sites NXT/S, 
wherein X designates any codable amino acid. 

7. A DNA construct according to any one of the preceding claims, wherein SP is a 
DNA sequence selected from the group of DNA sequences encoding the S. 
cerevisiae α-factor signal peptide, the signal peptide of mouse salivary amylase, 
the yeast carboxypeptidase signal peptide, the yeast aspartic protease 3 signal 
peptide or the yeast BAR1 signal peptide. 

8. A DNA construct according to any one of the preceding claims, wherein LP is a 
DNA sequence encoding a leader peptide with the general formula I: 
Q/SPIDDTESQTTSVNLMADDTESA/RFATYTXLDVVN/GL(ISMA)/(PGA)KR (I) 
wherein 

X is a codable amino acid or preferably a sequence of from 1 to 5 codable amino 
acids which may be the same or different, and is preferably selected from the 
group consisting of T, L, A, V, D, P, H, N, S, G, and Y is a codable amino acid selected 
from the group consisting of Q and N. 

9. A DNA construct according to claim 8, wherein Y is Q and X does not comprise S 
or T. 

10. A DNA construct according to any one of claims 1 to 7, wherein LS is a DNA 
sequence encoding a synthetic leader peptide with the general formula II: 

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)PLALDVVNLI(A/D)MAKR (II) 

wherein (A/D) can be any codable amino acid, but preferably is alanine (A) or 
aspartic acid (D). 

11. A DNA construct according to any one of claims 1 to 7, wherein LS is a DNA 
sequence encoding a synthetic leader peptide with the general formula III: 

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)PLALDVVNLI(A/D)MA (III) 

wherein (A/D) can be any codable amino acid, but preferably is alanine (A), serine 
(S), or aspartic acid (D). 

12. A DNA construct according to any one of the preceding claims, wherein X is 
selected from the sequences NA, TLA, DLA, PLA, TLAGG, TLADGG, TLADD, 
TLAGD, NSGG, TNSGG, and TSVGG. 

13. A DNA construct according to any one of the preceding claims, wherein the 
leader peptide coded for by the DNA sequence LP is selected from the group 
comprising the sequences LA23, TA54, TA56, TA57, TA59, LA64, TA65, TA67, 
TA68, TA76, TA77, TA78, TA79, TA80, TA89, TA90, and TA101 of Table 1 
herein. 

14. A DNA construct according to any one of the preceding claims, wherein the 
leader peptide coded for by the DNA sequence LP is selected from the group 
comprising the sequences TA75, TA75.50, TA75.15, TA75.4, TA75.51, TA75.58, 
and TA75.64 of Table 1a herein. 

15. A DNA construct according to any one of the preceding claims, wherein the 
leader peptide coded for by the DNA sequence LP is selected from the group 
comprising the sequences TA91, TA92, TA93, TA94, TA95, TA96, TA97, and 
TA98 of Table 1b herein. 

16. A DNA construct according to any one of the preceding claims, wherein PS is a 
DNA sequence encoding an endoprotease processing site which allows in vivo 
processing. 

17. A DNA construct according to the preceding claim wherein the processing site is 
selected from DNA sequences encoding a dibasic processing site, preferably 
encoding the amino acid sequences KR, RK, RR, or KK. 

18. A DNA construct according to any one of the preceding claims, wherein PS is a 
DNA sequence encoding an endoprotease processing site which allows in vitro 
processing. 

19. A DNA construct according to the preceding claim wherein the processing site is 
selected from DNA sequences encoding a monobasic or dibasic processing site, 
preferably encoding the amino acid sequences K, R, or KR, RK, RR, or KK. 

20. A DNA construct according to any one of the preceding claims, wherein the 
polypeptide is a polypeptide which is heterologous to yeast. 

21. A DNA construct according to the preceding claim, wherein the polypeptide is 
selected from the group consisting of aprotinin, tissue factor pathway inhibitor, or 
other protease inhibitors, insulin or insulin precursors, insulin-like polypeptides, 
such as insulin-like growth factor I and insulin-like growth factor II, human or 
bovine growth hormone, interleukin, glucagon, glucagon-like peptide 1, glucagon- 
like peptide II, GRPP, tissue plasminogen activator, transforming growth factor α 
or β, platelet-derived growth factor, enzymes, or a functional analogue thereof. 

22. A DNA construct according to claim 18, wherein the polypeptide is selected from 
the group consisting of insulin or insulin precursors, insulin-like polypeptides, such 
as insulin-like growth factor I and insulin-like growth factor II, glucagon, glucagon- 
like peptide 1, glucagon-like peptide II, GRPP, or a functional analogue thereof. 

23. A DNA construct according to any one of claims 1 to 17, wherein the polypeptide 
is a polypeptide which is homologous to yeast. 

24. A DNA construct according to the preceding claim, wherein the polypeptide is 
selected from the group consisting of the gene products of the KEX2 gene, and 
the YAP3 gene. 

25. A DNA construct according to any one of the preceding claims which furthermore 
comprises a promoter sequence located at the N-terminal end of the structure SP- 
LP-PS-*gene*. 

26. A DNA construct according to any one of the preceding claims which furthermore 
comprises a promoter sequence located at the N-terminal end of the structure SP- 
LP-(PS)-(S)-(PS)-*gene*. 

27. A DNA construct according to claim 25 and 26, wherein the promoter sequence 
is a yeast promoter sequence, preferably the TPI promoter. 

28. An expression cassette comprising the DNA construct according to claim 25, 
which additionally comprises a 5' terminally located promoter sequence and 
a terminator sequence (T)i located at the 3' terminal of the structure SP-LP-PS- 
*gene*, where i is 0 or 1. 

29. An expression cassette comprising the DNA construct according to claim 26, 
which additionally comprises a 5' terminally located promoter sequence and 
a terminator sequence (T)i located at the 3' terminal of the structure SP-LP-(PS)- 
(S)-(PS)-*gene*, where i is 0 or 1. 

30. An expression cassette according to claims 28 and 29, wherein i is 1 and T is a 
DNA sequence encoding the TPI terminator. 

31. A yeast expression vector comprising the DNA construct according to any of the 
preceding claims. 

32. A yeast cell which is capable of expressing a polypeptide and which is 
transformed with a yeast expression vector according to claim 31. 

33. A yeast cell according to claim 32 selected from the group consisting of 
Saccharomyces cerevisiae, Saccharomyces uvae, Saccharomyces kluyveri, 
Schizosaccharomyces pombe, Saccharomyces uvarum, Kluyveromyces lactis, 
Hansenula polymorpha, Pichia pastoris, Pichia methanolica, Pichia kluyveri, 
Yarrowia lipolytica, Candida sp., Candida utilis, Candida cacaoi, Geotrichum sp., 
and Geotrichum fermentans. 

34. A process for producing a polypeptide in yeast, the process comprising culturing 
a yeast cell, which is capable of expressing the desired polypeptide and which is 
transformed with a yeast expression vector according to claim 31, in a suitable 
medium to obtain expression and secretion of the polypeptide, after which the 
polypeptide is recovered from the medium. 

35. A process according to the preceding claim, wherein the yeast cell is selected 
from the group consisting of S. cerevisiae, Saccharomyces uvae, Saccharomyces 
kluyveri, Schizosaccharomyces pombe, Saccharomyces uvarum, Kluyveromyces 
lactis, Hansenula polymorpha, Pichia pastoris, Pichia methanolica, Pichia kluyveri, 
Yarrowia lipolytica, Candida sp., Candida utilis, Candida cacaoi, Geotrichum sp., 
and Geotrichum fermentans, preferably Saccharomyces cerevisiae. 

36. A DNA sequence encoding a synthetic prepro-leader peptide lacking the 
consensus N-linked glycosylation sites NXT/S, wherein X designates any codable 
amino acid which is not P. 

37. A DNA sequence according to the preceding claim selected from the group 
consisting of 

Q/SPIDDTESQTTSVNLMADDTESA/RFATYTXLDVVN/GL(ISMA)/(PGA)KR (I) 
wherein 

X is a codable amino acid or preferably a sequence of from 1 to 5 codable amino 
acids which may be the same or different, and is preferably selected from the 
group consisting of T, L, A, V, D, P, H, N, S, G, and Y is a codable amino acid selected 
from the group consisting of Q and N, and wherein the C-terminal KR is an 
optional processing site. 

38. A DNA sequence according to the preceding claim selected from the group 
consisting of LA23, TA54, TA56, TA57, TA59, LA64, TA65, TA67, TA68, TA76, 
TA77, TA78, TA79, TA80, TA89, TA90, and TA101 of Table 1 herein. 

39. A DNA sequence according to claim 36 selected from the group consisting of 

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)PLALDVVNLI(A/D)MAKR (II) 
wherein (A/D) can be any codable amino acid, but preferably is alanine (A), serine 
(S), or aspartic acid (D), and wherein the C-terminal KR is an optional processing 
site. 

40. A DNA sequence according to the preceding claim selected from the group 
consisting of TA75, TA75.50, TA75.15, TA75.4, TA75.51, TA75.58, and TA75.64 
of Table 1a herein. 

41. A DNA sequence according to claim 36 selected from the group consisting of 

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)PLALDVVNLI(A/D)MA (III) 
wherein (A/D) can be any codable amino acid, but preferably is alanine (A), serine 
(S), or aspartic acid (D). 

42. A DNA sequence according to the preceding claim selected from the group 
consisting of TA91, TA92, TA93, TA94, TA95, TA96, TA97, and TA98 of Table 
1b herein. 

43. A synthetic prepro-leader peptide lacking the consensus N-linked glycosylation 
sites NXT/S, wherein X designates any codable amino acid which is not P. 

44. A synthetic prepro-leader peptide according to the preceding claim selected from 
the group consisting of 

Q/SPIDDTESQTTSVNLMADDTESA/RFATYTXLDVVN/GL(ISMA)/(PGA)KR (I) 
wherein 

X is a codable amino acid or preferably a sequence of from 1 to 5 codable amino 
acids which may be the same or different, and is preferably selected from the 
group consisting of T, L, A, V, D, P, H, N, S, G, and Y is a codable amino acid selected 
from the group consisting of Q and N, and wherein the C-terminal KR is an 
optional processing site. 

45. A synthetic prepro-leader peptide according to the preceding claim selected from 
the group consisting of LA23, TA54, TA56, TA57, TA59, LA64, TA65, TA67, 
TA68, TA76, TA77, TA78, TA79, TA80, TA89, TA90, and TA101 of Table 1 
herein. 

46. A synthetic prepro-leader peptide according to claim 36 selected from the group 
consisting of 

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)PLALDVVNLI(A/D)MAKR (II) 

wherein (A/D) can be any codable amino acid, but preferably is alanine (A), serine 
(S), or aspartic acid (D), and wherein the C-terminal KR is an optional processing 
site. 

47. A synthetic prepro-leader peptide according to the preceding claim selected 
from the group consisting of TA75, TA75.50, TA75.15, TA75.4, TA75.51, TA75.58, 
and TA75.64 of Table 1a herein. 

48. A synthetic prepro-leader peptide according to claim 36 selected from the group 
consisting of 

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)PLALDVVNLI(A/D)MA (III) 
wherein (A/D) can be any codable amino acid, but preferably is alanine (A), serine 
(S), or aspartic acid (D). 

49. A synthetic prepro-leader peptide according to the preceding claim selected from 
the group consisting of TA91, TA92, TA93, TA94, TA95, TA96, TA97, and TA98 
of Table 1b herein. 

50. The use of a first DNA sequence encoding a synthetic prepro-leader lacking N- 
linked glycosylation sites for secretion of a protein in fungal cells, such as yeast 
cells. 

51. Use according to the preceding claim wherein said prepro-leader additionally 
lacks O-linked glycosylation sites. 

52. Use according to any of claims 36 to 37, wherein said synthetic prepro-leader 
has an amino acid sequence selected from the group consisting of 

Q/SPIDDTESQTTSVNLMADDTESA/RFATYTXLDVVN/GL(ISMA)/(PGA)KR (I) 
wherein 

X is a codable amino acid or preferably a sequence of from 1 to 5 codable amino 
acids which may be the same or different, and is preferably selected from the 
group consisting of T, L, A, V, D, P, H, N, S, G, and Y is a codable amino acid selected 
from the group consisting of Q and N, and wherein the C-terminal KR is an 
optional processing site. 

53. Use according to the preceding claim wherein said prepro-leader is selected from 
the group consisting of LA23, TA54, TA56, TA57, TA59, LA64, TA65, TA67, 
TA68, TA76, TA77, TA78, TA79, TA80, TA89, TA90, and TA101 of Table 1 
herein. 

54. Use according to any of claims 36 to 37, wherein said synthetic prepro-leader 
has an amino acid sequence selected from the group consisting of 

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)PLALDVVNLI(A/D)MAKR (II) 
wherein (A/D) can be any codable amino acid, but preferably is alanine (A), serine 
(S), or aspartic acid (D), and wherein the C-terminal KR is an optional processing 
site. 

55. Use according to the preceding claim wherein said prepro-leader is selected from 
the group consisting of TA75, TA75.50, TA75.15, TA75.4, TA75.51, TA75.58, and 
TA75.64 of Table 1a herein. 

56. Use according to the preceding claim wherein said synthetic prepro-leader has 
an amino acid sequence selected from the group consisting of 

QPIDD(A/D)E(A/D)Q(A/D)(A/D)(A/D)VNLMADD(A/D)E(A/D)AFA(A/D)Q(A/D)PLALDVVNLI(A/D)MA (III) 
wherein (A/D) can be any codable amino acid, but preferably is alanine (A), serine 
(S), or aspartic acid (D). 

57. Use according to the preceding claim wherein said synthetic prepro-leader has 
an amino acid sequence selected from the group consisting of TA91, TA92, TA93, 
TA94, TA95, TA96, TA97, and TA98 of Table 1b herein. 

58. Use according to any of claims 36 to 43, wherein said protein is encoded by a 
second DNA sequence fused at the 5' end to said first DNA sequence encoding 
said prepro-leader. 

59. Use according to the preceding claim wherein a third DNA sequence encoding a 
spacer peptide optionally having one or more processing sites is inserted in frame 
between the 3' end of said first DNA sequence encoding said prepro-leader and 
the 5' end of said second DNA sequence encoding said protein. 

60. Use according to the preceding claim wherein the DNA sequence encoding said 
spacer peptide is selected from DNA sequences encoding an oligopeptide having 
1 to 12 amino acid residues, such as EEAEPK, EEGEPK, E(EA)3EPK, EEPK. 

61. Use according to any of claims 36 to 46, wherein said protein is a heterologous 
protein. 

62. Use according to the preceding claim wherein said protein is selected from the 
group consisting of aprotinin, tissue factor pathway inhibitor, or other protease 
inhibitors, insulin or insulin precursors, insulin-like polypeptides, such as insulin- 
like growth factor I and insulin-like growth factor II, human or bovine growth 
hormone, interleukin, glucagon, glucagon-like peptide 1, glucagon-like peptide II, 
GRPP, tissue plasminogen activator, transforming growth factor α or β, platelet- 
derived growth factor, enzymes, or a functional analogue thereof. 

63. Use according to any of claims 36 to 48 wherein said protein is insulin or an 
insulin precursor. 

64. Use according to any of claims 36 to 46, wherein said protein is a homologous 
protein, preferably selected from the group consisting of the gene products of the 
yeast KEX2 and YAP3 genes. 

65. Use according to any of claims 36 to 50 wherein said yeast is selected from the 
group consisting of S. cerevisiae, Saccharomyces uvae, Saccharomyces kluyveri, 
Schizosaccharomyces pombe, Saccharomyces uvarum, Kluyveromyces lactis, 
Hansenula polymorpha, Pichia pastoris, Pichia methanolica, Pichia kluyveri, 
Yarrowia lipolytica, Candida sp., Candida utilis, Candida cacaoi, Geotrichum sp., 
and Geotrichum fermentans, preferably Saccharomyces cerevisiae. 
EcoRI 

901 TTCTTGCTTA AATCTATAAC TACAAAAAAC ACATACAGGA ATTCCATTCA 
AAGAACGAAT TTAGATATTG ATGTTTTTTG TGTATGTCCT TAAGGTAAGT 

951 AGAATAGTTC AAACAAGAAG ATTACAAACT ATCAATTTCA TACACAATAT 
TCTTATCAAG TTTGTTCTTC TAATGTTTGA TAGTTAAAGT ATGTGTTATA 

+1 MKLKTVRSAVLS 
BglII 
1001 AAACGATTAA AAGAATGAAA CTGAAAACTG TAAGATCTGC GGTCCTTTCG 
TTTGCTAATT TTCTTACTTT GACTTTTGAC ATTCTAGACG CCAGGAAAGC 

+1 SLFA SQV LGQ PIDD TES 
StyI 
1051 TCACTCTTTG CATCTCAGGT CCTTGGCCAA CCAATTGACG ACACTGAATC 
AGTGAGAAAC GTAGAGTCCA GGAACCGGTT GGTTAACTGC TGTGACTTAG 

+1 QTT SVNL MAD DTE SAF 
1101 TCAAACTACT TCTGTCAACT TGATGGCTGA CGACACTGAA TCTGCTTTCG 
AGTTTGATGA AGACAGTTGA ACTACCGACT GCTGTGACTT AGACGAAAGC 

+1 ATQT NSG GLDV VGL ISM 
StyI 
NcoI 
1151 CTACTCAAAC TAACTCTGGT GGTTTGGATG TTGTTGGTTT GATCTCCATG 
GATGAGTTTG ATTGAGACCA CCAAACCTAC AACAACCAAA CTAGAGGTAC 

+1 AK RE EGE PKF VNQH LCG 
StyI 
NcoI 
1201 GCTAAGAGAG AAGAAGGTGA ACCAAAGTTC GTTAACCAAC ACTTGTGCGG 
CGATTCTCTC TTCTTCCACT TGGTTTCAAG CAATTGGTTG TGAACACGCC 

+1 SHL VEAL YLV CGE RGF 
HindIII 
1251 TTCCCACTTG GTTGAAGCTT TGTACTTGGT TTGCGGTGAA AGAGGTTTCT 
AAGGGTGAAC CAACTTCGAA ACATGAACCA AACGCCACTT TCTCCAAAGA 

+1 FYTP KAA KGIV EQC CTS 
Bsu36I 
1301 TCTACACTCC TAAGGCTGCT AAGGGTATTG TCGAACAATG CTGTACCTCC 
AGATGTGAGG ATTCCGACGA TTCCCATAAC AGCTTGTTAC GACATGGAGG 

+1 ICSL YQL ENY CN* 
1351 ATCTGCTCCT TGTACCAATT GGAAAACTAC TGCAACTAGA CGCAGCCCGC 
TAGACGAGGA ACATGGTTAA CCTTTTGATG ACGTTGATCT GCGTCGGGCG 

XbaI 
1401 AGGCTCTAGA AACTAAGATT AATATAATTA TATAAAAATA TTATCTTCTT 
TCCGAGATCT TTGATTCTAA TTATATTAAT ATATTTTTAT AATAGAAGAA 

901 TTCTTGCTTA AATCTATAAC TACAAAAAAC ACATACAGGA ATTCCATTCA 
AAGAACGAAT TTAGATATTG ATGTTTTTTG TGTATGTCCT TAAGGTAAGT 

951 AGAATAGTTC AAACAAGAAG ATTACAAACT ATCAATTTCA TACACAATAT 
TCTTATCAAG TTTGTTCTTC TAATGTTTGA TAGTTAAAGT ATGTGTTATA 

+1 MKLKTVRSAVLS 
BglII 
1001 AAACGATTAA AAGAATGAAA CTGAAAACTG TAAGATCTGC GGTCCTTTCG 
TTTGCTAATT TTCTTACTTT GACTTTTGAC ATTCTAGACG CCAGGAAAGC 

+1 SLFA SQV LGQ PIDD TES 
StyI 
1051 TCACTCTTTG CATCTCAGGT CCTTGGCCAA CCAATTGACG ACACTGAATC 
AGTGAGAAAC GTAGAGTCCA GGAACCGGTT GGTTAACTGC TGTGACTTAG 

+1 QTT SVNL MAD DTE SAF 
1101 TCAAACTACT TCTGTCAACT TGATGGCTGA CGACACTGAA TCTGCTTTCG 
AGTTTGATGA AGACAGTTGA ACTACCGACT GCTGTGACTT AGACGAAAGC 

+1 ATQT NSG GLDV VGL PGA 
1151 CTACTCAAAC TAACTCTGGT GGTTTGGATG TTGTTGGTTT GCCAGGTGCT 
GATGAGTTTG ATTGAGACCA CCAAACCTAC AACAACCAAA CGGTCCACGA 

+1 KRF V NQH LCG SHLV EAL 
HindIII 
1201 AAGAGATTCG TTAACCAACA CTTGTGCGGT TCCCACTTGG TTGAAGCTTT 
TTCTCTAAGC AATTGGTTGT GAACACGCCA AGGGTGAACC AACTTCGAAA 

+1 YLV CGER GFF YTP KAA 
Bsu36I 
1251 GTACTTGGTT TGCGGTGAAA GAGGTTTCTT CTACACTCCT AAGGCTGCTA 
CATGAACCAA ACGCCACTTT CTCCAAAGAA GATGTGAGGA TTCCGACGAT 

+1 KGIV EQC CTSI CSL YQL 
1301 AGGGTATTGT CGAACAATGC TGTACCTCCA TCTGCTCCTT GTACCAATTG 
TCCCATAACA GCTTGTTACG ACATGGAGGT AGACGAGGAA CATGGTTAAC 

+1 E N Y C N * 
XbaI 
1351 GAAAACTACT GCAACTAGAC GCAGCCCGCA GGCTCTAGAA ACTAAGATTA 
CTTTTGATGA CGTTGATCTG CGTCGGGCGT CCGAGATCTT TGATTCTAAT 
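
The relation between the nucleotide lines and the "+1" amino acid lines in the listings above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the standard genetic code and uses no external library. It translates the start of the open reading frame (from the ATG at position 1015) and shows how the complementary strand can be used to cross-check individual bases. The helper names are illustrative and are not taken from the source.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping after the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


def complement(dna):
    """Return the base-by-base complement (not reversed), as printed in the listing."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


# Nucleotides 1015-1050 of the listing: start codon through the first 12 residues.
orf_start = "ATGAAA CTGAAAACTG TAAGATCTGC GGTCCTTTCG"
print(translate(orf_start))

# Nucleotides 1001-1010 and their complement as printed on the second line.
print(complement("AAACGATTAA"))
```

Because the second line of each pair is the complement of the first, a base that is unreadable on one strand can usually be recovered from the other.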